Topic Detection with Hypergraph Partition Algorithm
Abstract
This paper proposes SMHP (Similarity Matrix based Hypergraph Partition), an algorithm intended to improve the efficiency of topic detection. SMHP first builds a T-MI-TFIDF term-weighting model, which extends TF-IDF by introducing Mutual Information (MI) and by increasing the weight of terms that appear in the title. A Vector Space Model (VSM) is then constructed from the term weights, and its dimensionality is reduced by combining H-TOPN with Principal Component Analysis (PCA). Finally, stories are grouped into topics using SMHP. Experimental results show that the proposed methods are better suited to clustering topics: SMHP with these additions handles the problem of relationships among multiple stories effectively and improves the accuracy of the clustering results.
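The preprocessing pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the T-MI-TFIDF formula, the H-TOPN selection rule, or the hypergraph partition procedure, so none of those are reproduced here. The code below only assumes a plain TF-IDF weighting with an additive title boost, PCA computed via SVD, and a cosine similarity matrix as input to a later clustering step. The function names and the `title_boost` parameter are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

import numpy as np


def title_boosted_tfidf(docs, title_boost=2.0):
    """Build a document-term matrix from (title_tokens, body_tokens) pairs.

    Assumption: each title occurrence adds `title_boost` to the term count.
    The paper's actual T-MI-TFIDF weighting (including MI) is not shown.
    """
    vocab = sorted({t for title, body in docs for t in title + body})
    index = {t: j for j, t in enumerate(vocab)}
    n_docs = len(docs)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for title, body in docs:
        df.update(set(title + body))

    X = np.zeros((n_docs, len(vocab)))
    for i, (title, body) in enumerate(docs):
        counts = Counter(body)
        for t in title:
            counts[t] += title_boost  # title terms count extra
        total = sum(counts.values())
        for t, c in counts.items():
            idf = math.log((1 + n_docs) / (1 + df[t])) + 1.0  # smoothed IDF
            X[i, index[t]] = (c / total) * idf
    return X, vocab


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def cosine_similarity_matrix(Z):
    """Pairwise cosine similarity between the rows of Z."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    U = Z / norms
    return U @ U.T


docs = [
    (["quake"], ["quake", "hits", "city"]),
    (["quake"], ["city", "quake", "rescue"]),
    (["election"], ["vote", "election", "result"]),
]
X, vocab = title_boosted_tfidf(docs)
Z = pca_reduce(X, n_components=2)
S = cosine_similarity_matrix(Z)
```

Here `S` is a story-by-story similarity matrix of the kind a similarity-matrix-based partitioner would take as input; how SMHP turns it into hyperedges and partitions them is specified in the paper itself, not in this sketch.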
Similar resources
From Graph to Hypergraph Multiway Partition: Is the Single Threshold the Only Route?
We consider the Hypergraph Multiway Partition problem (Hyper-MP). The input consists of an edge-weighted hypergraph $G = (V, E)$ and $k$ vertices $s_1, \ldots, s_k$ called terminals. A multiway partition of the hypergraph is a partition (or labeling) of the vertices of $G$ into $k$ sets $A_1, \ldots, A_k$ such that $s_i \in A_i$ for each $i \in [k]$. The cost of a multiway partition $(A_1, \ldots, A_k)$ is $\sum_{i=1}^{k} w(\delta(A_i))$, ...
K-way Hypergraph Partitioning and Color Image Segmentation
The goal of still color image segmentation is to divide the image into homogeneous regions. Object extraction, object recognition and object-based compression are typical applications that use still segmentation as a low-level image processing step. In this paper, we present a method for color image segmentation. It formulates a color image segmentation problem as a partition of a Color Image Neighb...
Parkway 2.0: A Parallel Multilevel Hypergraph Partitioning Tool
We recently proposed a coarse-grained parallel multilevel algorithm for the k-way hypergraph partitioning problem. This paper presents a formal analysis of the algorithm's scalability in terms of its isoefficiency function, describes its implementation in the Parkway 2.0 tool and provides a run-time and partition quality comparison with state-of-the-art serial hypergraph partitioners. The isoeff...
Hypergraph modelling for geometric model fitting
In this paper, we propose a novel hypergraph based method (called HF) to fit and segment multi-structural data. The proposed HF formulates the geometric model fitting problem as a hypergraph partition problem based on a novel hypergraph model. In the hypergraph model, vertices represent data points and hyperedges denote model hypotheses. The hypergraph, with large and “data-determined” degrees ...
A Multi-level Hypergraph Partitioning Algorithm Using Rough Set Clustering
The hypergraph partitioning problem has many applications in scientific computing and provides a more accurate inter-processor communication model for distributed systems than the equivalent graph problem. In this paper, we propose a sequential multi-level hypergraph partitioning algorithm. The algorithm makes novel use of the technique of rough set clustering in categorising the vertices of th...
Journal title:
- JSW
Volume 6, Issue
Pages -
Publication date: 2011